R Markdown

This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown, see http://rmarkdown.rstudio.com.

When you click the Knit button, a document is generated that includes both the content and the output of any embedded R code chunks. R code chunks are embedded throughout the sections below.

1. ggplot2 package

1.1 ggplot2 demo

library(ggplot2)
mammals <- MASS::mammals

ggplot(mammals, aes(x = body, y = brain)) + 
  geom_point()

ggplot(mammals, aes(x = body, y = brain)) + 
  geom_point(alpha = 0.6) + 
  stat_smooth(method = "lm", color = "red", se = FALSE)
## `geom_smooth()` using formula 'y ~ x'

ggplot(mammals, aes(x = body, y = brain)) + 
  geom_point(alpha = 0.6) + 
  coord_fixed() + # forces a specified ratio between the physical representation of data units
  scale_x_log10() + 
  scale_y_log10() + 
  stat_smooth(method = "lm", color = "#C42126", se = FALSE, size = 1) # ggplot2 >= 3.4.0 prefers linewidth = 1 for lines
## `geom_smooth()` using formula 'y ~ x'

A statistical graphic is a mapping from data to the aesthetic attributes (aes: color, shape, size, and so on) of geometric objects (geom: points, lines, bars, and so on).

1.2 The three essential grammatical elements

| Element | Description |
|---|---|
| Data | The data-set being plotted. |
| Aesthetics | The scales onto which we map our data. |
| Geometries | The visual elements used for our data. |

1.2.1 Core competency

| Element | Description |
|---|---|
| Data | The data-set being plotted. |
| Aesthetics | The scales onto which we map our data. |
| Geometries | The visual elements used for our data. |
| Themes | All non-data ink. |

1.2.2 seven grammatical elements

| Element | Description |
|---|---|
| Data | The data-set being plotted. |
| Aesthetics | The scales onto which we map our data. |
| Geometries | The visual elements used for our data. |
| Themes | All non-data ink. |
| Statistics | Representations of our data to aid understanding. |
| Coordinates | The space on which the data will be plotted. |
| Facets | Plotting small multiples. |
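
As a rough illustration of how the seven elements combine, the sketch below uses one of each on the built-in `iris` data. The particular choices (`geom_point()`, `stat_smooth()`, `coord_cartesian()`, `facet_wrap()`, `theme_classic()`) are assumptions made for this example only; any other member of each family would serve as well.

```r
library(ggplot2)

p <- ggplot(iris,                                     # data
            aes(x = Sepal.Length, y = Sepal.Width)) + # aesthetics
  geom_point(alpha = 0.6) +                           # geometries
  stat_smooth(method = "lm", formula = y ~ x,
              se = FALSE) +                           # statistics
  coord_cartesian(xlim = c(4, 8)) +                   # coordinates
  facet_wrap(~ Species) +                             # facets
  theme_classic()                                     # themes
p
```

Each `+` adds one layer, so removing any line except the first two still leaves a valid plot.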

1.3 ggplot2 package

The grammar of graphics implemented in R. Two key concepts:

  1. Layer grammatical elements

  2. Aesthetic mappings

### 1. data
mammals
##                               body   brain
## Arctic fox                   3.385   44.50
## Owl monkey                   0.480   15.50
## Mountain beaver              1.350    8.10
## Cow                        465.000  423.00
## Grey wolf                   36.330  119.50
## Goat                        27.660  115.00
## Roe deer                    14.830   98.20
## Guinea pig                   1.040    5.50
## Verbet                       4.190   58.00
## Chinchilla                   0.425    6.40
## Ground squirrel              0.101    4.00
## Arctic ground squirrel       0.920    5.70
## African giant pouched rat    1.000    6.60
## Lesser short-tailed shrew    0.005    0.14
## Star-nosed mole              0.060    1.00
## Nine-banded armadillo        3.500   10.80
## Tree hyrax                   2.000   12.30
## N.A. opossum                 1.700    6.30
## Asian elephant            2547.000 4603.00
## Big brown bat                0.023    0.30
## Donkey                     187.100  419.00
## Horse                      521.000  655.00
## European hedgehog            0.785    3.50
## Patas monkey                10.000  115.00
## Cat                          3.300   25.60
## Galago                       0.200    5.00
## Genet                        1.410   17.50
## Giraffe                    529.000  680.00
## Gorilla                    207.000  406.00
## Grey seal                   85.000  325.00
## Rock hyrax-a                 0.750   12.30
## Human                       62.000 1320.00
## African elephant          6654.000 5712.00
## Water opossum                3.500    3.90
## Rhesus monkey                6.800  179.00
## Kangaroo                    35.000   56.00
## Yellow-bellied marmot        4.050   17.00
## Golden hamster               0.120    1.00
## Mouse                        0.023    0.40
## Little brown bat             0.010    0.25
## Slow loris                   1.400   12.50
## Okapi                      250.000  490.00
## Rabbit                       2.500   12.10
## Sheep                       55.500  175.00
## Jaguar                     100.000  157.00
## Chimpanzee                  52.160  440.00
## Baboon                      10.550  179.50
## Desert hedgehog              0.550    2.40
## Giant armadillo             60.000   81.00
## Rock hyrax-b                 3.600   21.00
## Raccoon                      4.288   39.20
## Rat                          0.280    1.90
## E. American mole             0.075    1.20
## Mole rat                     0.122    3.00
## Musk shrew                   0.048    0.33
## Pig                        192.000  180.00
## Echidna                      3.000   25.00
## Brazilian tapir            160.000  169.00
## Tenrec                       0.900    2.60
## Phalanger                    1.620   11.40
## Tree shrew                   0.104    2.50
## Red fox                      4.235   50.40
### 2. aesthetics
ggplot(mammals, aes(x = body, y = brain)) +  # aes means aesthetics
  geom_point() # this is the geometry

### 3. geometry
g <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_jitter()
g

### 4. theme

g <- g + labs(x = "Sepal Length (cm)", y = "Sepal Width (cm)") +
  theme_classic()
g


1.4 aesthetics

1.4.1 mapping on to X and Y axes

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) + 
  geom_point()

1.4.2 mapping on to color

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, 
                 color = Species)) + 
  geom_point()

1.4.3 Mapping onto the color aesthetic in geom

ggplot(iris) + 
  geom_point(aes(x = Sepal.Length, y = Sepal.Width, color = Species))

Mapping aesthetics inside the geom rather than in ggplot() is only necessary if:

1. Not all layers should inherit the same aesthetics

2. You are mixing different data sources
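
To make the two cases listed above concrete, here is a minimal sketch. The summary data frame `iris_means` is a hypothetical helper built only for this illustration; it is not part of the original notes.

```r
library(ggplot2)

# Hypothetical second data source: per-species means of the sepal measurements
iris_means <- aggregate(cbind(Sepal.Length, Sepal.Width) ~ Species,
                        data = iris, FUN = mean)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  # color is mapped only in this layer, so later layers do not inherit it
  geom_point(aes(color = Species), alpha = 0.6) +
  # this layer reads a different data frame but still inherits x and y
  geom_point(data = iris_means, shape = 4, size = 5)
p
```

Because `color` is mapped inside the first `geom_point()`, the mean markers are drawn in the default black rather than being colored by species.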

| Aesthetic | Description |
|---|---|
| x | X axis position |
| y | Y axis position |
| fill | Fill color |
| color | Color of points, outlines of other geoms |
| size | Area or radius of points, thickness of lines |
| alpha | Transparency |
| linetype | Line dash pattern |
| labels | Text on a plot or axes |
| shape | Shape |

1.5 Attributes

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "red")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(size = 10)

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(shape = 4)
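
The difference between an attribute and an aesthetic is easy to miss, so the following sketch contrasts the two on the same plot. The object names `p_attr` and `p_aes` are made up for the comparison.

```r
library(ggplot2)

# Attribute: a fixed value set outside aes(); every point is drawn red, no legend
p_attr <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "red")

# Aesthetic: inside aes(), "red" is treated as a one-level data value,
# so ggplot2 assigns a default palette color and adds a legend
p_aes <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(aes(color = "red"))

p_attr
p_aes
```

In short, values inside `aes()` are mapped through a scale, while values outside it are used literally.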

1.6 Positions (avoiding overlap)

# default position='identity'
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point()

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(position = 'identity')

1.6.1 position "jitter"

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter")

posn_j <- position_jitter(0.1) 
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, col = Species)) + 
  geom_point(position = posn_j)
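
Because jittering adds random noise, the same code can draw a slightly different plot on each run. A small sketch of one way to make it repeatable uses the `seed` argument of `position_jitter()`; the width, height, and seed values here are arbitrary choices, not values from the notes above.

```r
library(ggplot2)

# Arbitrary jitter amounts and seed, chosen only for illustration
posn_j_seed <- position_jitter(width = 0.1, height = 0.1, seed = 136)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(position = posn_j_seed, alpha = 0.6)
p
```

With a fixed seed, rebuilding the plot places every point at the same jittered position.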

1.7 scales

scale_x_*()

scale_y_*()

scale_color_*()

scale_fill_*()

scale_shape_*()

scale_linetype_*()

scale_size_*()
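
The functions listed above follow the naming pattern `scale_<aesthetic>_<type>()`. As a short sketch, the example below combines three of them on one plot; the specific colors and shape codes are arbitrary values picked for illustration.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                      color = Species, shape = Species)) +
  geom_point() +
  scale_y_continuous(name = "Sepal Width") +
  # one color and one shape per species, in factor-level order
  scale_color_manual(name = "Species",
                     values = c("#1b9e77", "#d95f02", "#7570b3")) +
  scale_shape_manual(name = "Species", values = c(16, 17, 15))
p
```

Giving both scales the same name lets ggplot2 merge the color and shape keys into a single legend.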

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous("Sepal Length") + 
  scale_color_discrete("Species")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous(name = "test") + # change the x-axis label
  scale_color_discrete("Species") # this is the legend title

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous("Sepal Length", limits = c(2,8)) + 
  scale_color_discrete("Species")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous("Sepal Length", 
                     limits = c(2, 8),
                     breaks = seq(2, 8, 3), 
                     expand = c(0, 0)) +
  scale_color_discrete("Species")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous("Sepal Length", limits = c(2, 8), 
                     breaks = seq(2, 8, 3),
                     expand = c(0, 0), 
                     labels = c("Setosa", "Versicolor", "Virginica")) + 
  scale_color_discrete("Species")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  scale_x_continuous("Sepal Length", limits = c(2, 8), 
                     breaks = seq(2, 8, 3),
                     expand = c(0, 0), 
                     labels = c("A", "B", "C")) + 
  scale_color_discrete("Species")

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_point(position = "jitter") + 
  labs(x = "Sepal Length", y = "Sepal Width", color = "Species")

1.8 theme layer

| Type | Modified using |
|---|---|
| text | element_text() |
| line | element_line() |
| rectangle | element_rect() |
| nothing | element_blank() |
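
A compact sketch with one theme argument per element type from the table may help fix the idea. The theme arguments chosen here (`axis.title`, `axis.line`, `legend.key`, `panel.grid.minor`) are just examples; many others accept the same element functions.

```r
library(ggplot2)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_jitter(alpha = 0.6) +
  theme(
    axis.title       = element_text(face = "bold"),   # text
    axis.line        = element_line(color = "black"), # line
    legend.key       = element_rect(fill = "white"),  # rectangle
    panel.grid.minor = element_blank()                # nothing: remove the element
  )
p
```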

theme(text, axis.title, axis.title.x, axis.title.x.top, axis.title.x.bottom, axis.title.y, axis.title.y.left, axis.title.y.right, title, legend.title, plot.title, plot.subtitle, plot.caption, plot.tag, axis.text, axis.text.x, axis.text.x.top, axis.text.x.bottom, axis.text.y, axis.text.y.left, axis.text.y.right, legend.text, strip.text, strip.text.x, strip.text.y )

ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_jitter(alpha = 0.6) + 
  theme(axis.title = element_text(color = "blue"))

theme_iris <- theme(text = element_text(family = "serif", size = 14), 
                    rect = element_blank(), panel.grid = element_blank(), 
                    title = element_text(color = "#8b0000"), 
                    axis.line = element_line(color = "black"))

z <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) + 
  geom_jitter(alpha = 0.6) + 
  scale_x_continuous("Sepal Length (cm)", limits = c(4,8), expand = c(0,0)) + 
  scale_y_continuous("Sepal Width (cm)", limits = c(1.5,5), expand = c(0,0)) + 
  scale_color_brewer("Species", palette = "Dark2", labels = c("Setosa", "Versicolor", "Virginica"))
z

z + theme_iris

2. handbook

2.1 geometries

48 geoms

abline area bar bin2d blank boxplot col contour
count crossbar curve density density2d density_2d dotplot errorbar
errorbarh freqpoly hex histogram hline jitter label line
linerange map path point pointrange polygon qq qq_line
quantile raster rect ribbon rug segment sf sf_label
sf_text smooth spoke step text tile violin vline
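
Each name in the list corresponds to a `geom_*()` function, and several geoms can often be swapped on the same data and aesthetics. A minimal sketch, assuming the `iris` data used earlier; `base` is a hypothetical name for the shared plot object.

```r
library(ggplot2)

# Shared data and aesthetics; only the geometry changes below
base <- ggplot(iris, aes(x = Species, y = Sepal.Width))

base + geom_boxplot()            # summary statistics per species
base + geom_violin()             # density shape per species
base + geom_jitter(width = 0.2)  # raw observations, jittered horizontally
```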

2.2 shape

shape